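Recent studies have focused on formulating traffic forecasting as a spatio-temporal graph modeling problem. They typically construct a static spatial graph at each time step and then connect each node across adjacent time steps to build a spatio-temporal graph. In such a graph, the correlations between different nodes at different time steps are not explicitly reflected, which may limit the learning ability of graph neural networks. Moreover, because these models use the same adjacency matrix at every time step, they ignore the dynamic spatio-temporal correlations among nodes. To overcome these limitations, we propose Spatio-Temporal Joint Graph Convolutional Networks (STJGCN) for traffic forecasting over several time steps ahead on a road network. Specifically, we construct both pre-defined and adaptive spatio-temporal joint graphs (STJGs) between any two time steps, which represent comprehensive and dynamic spatio-temporal correlations. We further design dilated causal spatio-temporal joint graph convolution layers on the STJG to capture spatio-temporal dependencies from distinct perspectives with multiple ranges. A multi-range attention mechanism is proposed to aggregate the information of different ranges. Experiments on four public traffic datasets demonstrate that STJGCN is computationally efficient and outperforms 11 state-of-the-art baseline methods.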
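This paper aims to unify spatial dependency and temporal dependency in a non-Euclidean space while capturing the inner spatial dependencies of traffic data. For spatial-temporal attributed entities with a topological structure, space-time is continuous and unified, and the current state of each node is influenced by the past states of its neighbors over a period that varies from neighbor to neighbor. Most spatial-temporal neural networks for traffic forecasting handle spatial dependency and temporal correlation separately, which impairs spatial-temporal integrity, and they ignore the fact that the period over which a node depends on its neighbors can be delayed and dynamic. To model this actual condition, we propose TraverseNet, a novel spatial-temporal graph neural network that views space and time as an inseparable whole in order to mine spatial-temporal graphs while exploiting the evolving spatial-temporal dependencies of each node through a message propagation mechanism. Experiments with ablation and parameter studies have validated the effectiveness of the proposed TraverseNet, and the detailed implementation can be found at https://github.com/nnzhan/traversenet.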
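Graph convolutional networks have become indispensable for deep learning from graph-structured data. Most existing graph convolutional networks share two major shortcomings. First, they are essentially low-pass filters and therefore ignore the potentially useful middle and high frequency bands of graph signals. Second, the bandwidth of existing graph convolutional filters is fixed: the parameters of a graph convolutional filter only transform the graph inputs and do not change the curvature of the graph convolutional filter function. In practice, unless we have expert domain knowledge, we cannot be sure whether frequencies should be retained or cut off at a given point. In this paper, we propose Automatic Graph Convolutional Networks (AutoGCN) to capture the full spectrum of graph signals and to update the bandwidth of graph convolutional filters automatically. Although it is based on graph spectral theory, our AutoGCN is also localized in space and has a spatial form. Experimental results show that AutoGCN achieves significant improvement over baseline methods that act only as low-pass filters.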
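The low-pass behaviour described in this abstract can be illustrated with a small sketch. It is not AutoGCN and not any specific published layer. It assumes a ring graph with self-loops and one neighbour-averaging propagation step (on a regular graph the row-normalised and symmetric normalisations coincide). The names `propagate_on_ring` and `amplitude` are hypothetical helpers introduced only for this illustration.

```python
# Illustrative sketch (assumption: GCN-style averaging on a ring graph with self-loops).

def propagate_on_ring(signal):
    """Average each node with itself and its two ring neighbours."""
    n = len(signal)
    return [
        (signal[(i - 1) % n] + signal[i] + signal[(i + 1) % n]) / 3.0
        for i in range(n)
    ]


def amplitude(signal):
    """Largest absolute value in the signal."""
    return max(abs(v) for v in signal)


n_nodes = 8
smooth = [1.0] * n_nodes                             # lowest-frequency signal on the ring
alternating = [(-1.0) ** i for i in range(n_nodes)]  # highest-frequency signal on the ring

# Gain = output amplitude / input amplitude after one propagation step.
smooth_gain = amplitude(propagate_on_ring(smooth)) / amplitude(smooth)
alternating_gain = amplitude(propagate_on_ring(alternating)) / amplitude(alternating)

print("gain on smooth signal:     ", smooth_gain)
print("gain on alternating signal:", alternating_gain)
```

Under these assumptions the smooth signal passes through unchanged while the alternating signal is attenuated, which is the sense in which such a propagation step acts as a low-pass filter.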
Modeling multivariate time series has long attracted researchers from a diverse range of fields including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another but, upon looking closely, it's fair to say that existing methods fail to fully exploit latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation, which means they cannot be applied directly to multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge like variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets and achieves on-par performance with other approaches on two traffic datasets which provide extra structural information.
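One way a graph learning module could yield uni-directed relations is to build an antisymmetric score from two sets of node embeddings and keep only its positive part, so that an edge i → j excludes the edge j → i. The sketch below illustrates that idea only. The function name `uni_directed_adjacency`, the saturation factor `alpha`, and the example embeddings are assumptions made for illustration, and whether this matches the module used in the paper is not stated in the abstract.

```python
import math

# Illustrative sketch (assumption): A = relu(tanh(alpha * (M1 M2^T - M2 M1^T))).
# The score matrix is antisymmetric, so at most one of A[i][j], A[j][i] is non-zero.


def uni_directed_adjacency(emb_src, emb_dst, alpha=3.0):
    """Build a uni-directed adjacency matrix from two lists of node embeddings."""
    n = len(emb_src)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Antisymmetric score: score(i, j) == -score(j, i).
            score = dot(emb_src[i], emb_dst[j]) - dot(emb_dst[i], emb_src[j])
            adj[i][j] = max(0.0, math.tanh(alpha * score))
    return adj


# Hypothetical embeddings for three variables (in practice these would be learned).
emb_src = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
emb_dst = [[0.2, 0.9], [0.7, 0.1], [0.3, 0.3]]
adj = uni_directed_adjacency(emb_src, emb_dst)

for row in adj:
    print([round(v, 3) for v in row])
```

In a trained model the embeddings would be parameters optimised jointly with the forecasting loss; here they are fixed numbers so the directionality property is easy to inspect.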
Spatial-temporal graph modeling is an important task to analyze the spatial relations and temporal trends of components in a system. Existing approaches mostly capture the spatial dependency on a fixed graph structure, assuming that the underlying relation between entities is pre-determined. However, the explicit graph structure (relation) does not necessarily reflect the true dependency, and genuine relations may be missing due to incomplete connections in the data. Furthermore, existing methods are ineffective at capturing temporal trends, as the RNNs or CNNs employed in these methods cannot capture long-range temporal sequences. To overcome these limitations, we propose in this paper a novel graph neural network architecture, Graph WaveNet, for spatial-temporal graph modeling. By developing a novel adaptive dependency matrix and learning it through node embedding, our model can precisely capture the hidden spatial dependency in the data. With a stacked dilated 1D convolution component whose receptive field grows exponentially as the number of layers increases, Graph WaveNet is able to handle very long sequences. These two components are integrated seamlessly in a unified framework and the whole framework is learned in an end-to-end manner. Experimental results on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate the superior performance of our algorithm.
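The claim that a stack of dilated 1D convolutions has an exponentially growing receptive field can be checked with a short sketch. It assumes kernel size 2 and dilations that double per layer (1, 2, 4, 8), which is a common schedule but not necessarily the one used in Graph WaveNet. The helper names `receptive_field` and `dilated_causal_conv` are hypothetical.

```python
# Illustrative sketch (assumptions: kernel size 2, dilation doubling per layer).


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated convolutions: 1 + (k - 1) * sum(dilations)."""
    return 1 + (kernel_size - 1) * sum(dilations)


def dilated_causal_conv(signal, weights, dilation):
    """y[t] = sum_k weights[k] * signal[t - k * dilation], treating t < 0 as zero."""
    out = []
    for t in range(len(signal)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                total += w * signal[idx]
        out.append(total)
    return out


dilations = [1, 2, 4, 8]

# Push a unit impulse through the stack; the number of non-zero outputs
# equals the number of past steps a single output can see.
impulse = [1.0] + [0.0] * 31
response = impulse
for d in dilations:
    response = dilated_causal_conv(response, [1.0, 1.0], d)

support = sum(1 for v in response if v != 0.0)
print("formula:", receptive_field(2, dilations), "measured:", support)
```

With doubling dilations and kernel size 2, each added layer doubles the receptive field, so L layers cover 2^L time steps while the parameter count grows only linearly in L.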
Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this survey, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art graph neural networks into four categories, namely recurrent graph neural networks, convolutional graph neural networks, graph autoencoders, and spatial-temporal graph neural networks. We further discuss the applications of graph neural networks across various domains and summarize the open source codes, benchmark data sets, and model evaluation of graph neural networks. Finally, we propose potential research directions in this rapidly growing field.
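Owing to its wide application on online social networks, influence maximization (IM) has attracted considerable attention over the past few decades. Current IM research lacks human-understandable explanations of how a seed set produces its influence effect, which reduces the trustworthiness of existing solutions despite their applicability. Because of the complexity of IM, most current research concentrates on estimating first-order spreading power and often disregards the interplay between flows dispersed from different seeds. This study uses Sobol indices, the cornerstone of variance-based sensitivity analysis, to decompose the influence effect into contributions from individual seeds and their interactions. The Sobol indices are tailored to the IM context by modeling seed selection as binary variables. This explanation method is universally applicable to all network types, IM techniques, and diffusion models. Based on the explanation method, a general framework called SobolIM is proposed to improve the performance of IM studies by over-selecting nodes and then applying an elimination strategy. Experiments on synthetic and real-world graphs show that the explanation of the influence effect can reliably identify key high-order interactions between seeds across a variety of networks and IM methods. Empirically, SobolIM shows advantages in both effectiveness and efficiency.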
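The decomposition into individual and interaction effects can be sketched on a toy problem. The code below computes exact first-order Sobol indices by enumerating all binary seed selections. It assumes each seed is included independently with probability 0.5 and uses a made-up influence function `toy_influence`; neither assumption comes from the paper, and real IM settings would estimate these quantities by sampling a diffusion model instead.

```python
from itertools import product

# Illustrative sketch (assumptions: independent Bernoulli(0.5) seed indicators,
# a hypothetical closed-form influence function, exact enumeration).


def sobol_first_order(f, n_vars):
    """First-order indices S_i = Var(E[f | x_i]) / Var(f) over all binary inputs."""
    points = list(product([0, 1], repeat=n_vars))
    values = [f(x) for x in points]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)

    indices = []
    for i in range(n_vars):
        cond_var = 0.0
        for level in (0, 1):
            cond_vals = [v for x, v in zip(points, values) if x[i] == level]
            cond_mean = sum(cond_vals) / len(cond_vals)
            cond_var += 0.5 * (cond_mean - mean) ** 2  # each level has probability 0.5
        indices.append(cond_var / variance)
    return indices


def toy_influence(x):
    """Hypothetical spread: seed 0 acts alone; seeds 1 and 2 add spread only jointly."""
    return 2.0 * x[0] + 3.0 * x[1] * x[2]


first_order = sobol_first_order(toy_influence, 3)
interaction_share = 1.0 - sum(first_order)

print("first-order indices:", [round(s, 4) for s in first_order])
print("share explained by interactions:", round(interaction_share, 4))
```

The share left over after summing the first-order indices is attributable to interactions, here the joint effect of seeds 1 and 2, which a purely first-order analysis would miss.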
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
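The simplex equiangular tight frame mentioned above has a simple closed form, sketched here for illustration. The construction uses the standard formula sqrt(K / (K - 1)) * (I - (1/K) 11^T) with the identity as the orthogonal factor, which is an assumption about presentation rather than anything specific to this paper. The function names `simplex_etf` and `cosine` are hypothetical.

```python
import math

# Illustrative sketch (assumption: embedding dimension equals the number of classes K).


def simplex_etf(num_classes):
    """Return K unit vectors whose pairwise cosine similarity is -1 / (K - 1)."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


num_classes = 4
vectors = simplex_etf(num_classes)
norms = [math.sqrt(sum(a * a for a in v)) for v in vectors]
pairwise = [
    cosine(vectors[i], vectors[j])
    for i in range(num_classes)
    for j in range(i + 1, num_classes)
]

print("norms:", [round(n, 6) for n in norms])
print("pairwise cosines:", [round(c, 6) for c in pairwise])
```

Every pair of vectors has the same cosine, the most negative value that K vectors can share, which is the "equiangular and maximally separated" structure the abstract refers to.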
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only the image-level labels. Most of the existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet ignore the co-occurrence confounder of the object and context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also brings a dilemma in which classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and addressing the dilemma problem between classification and localization performance.
Witnessing the impressive achievements of pre-training techniques on large-scale data in the fields of computer vision and natural language processing, we wonder whether this idea could be adapted in a grab-and-go spirit to mitigate the sample inefficiency problem for visuomotor driving. Given the highly dynamic and variant nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains massive irrelevant information for decision making, which makes predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for policy pre-training in visuomotor driving. We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. The proposed PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input. In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving-policy-related representations and is thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios have demonstrated the superiority of our proposed approach, where improvements range from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.